Nonparametric Depth-Based Multivariate Outlier Identifiers, and Masking Robustness Properties

Authors

  • Xin Dang
  • Robert Serfling
Abstract

In extending univariate outlier detection methods to higher dimension, various issues arise: limited visualization methods, inadequacy of marginal methods, lack of a natural order, limited parametric modeling, and, when using Mahalanobis distance, restriction to ellipsoidal contours. To overcome these limitations, we introduce nonparametric multivariate outlier identifiers based on multivariate depth functions, which can generate contours following the shape of the data set. We also study masking robustness, that is, robustness against misidentification of outliers as nonoutliers. In particular, we define a masking breakdown point (MBP), adapting to our setting certain ideas of Davies and Gather (1993) and Becker and Gather (1999) based on the Mahalanobis distance outlyingness. We then compare four affine invariant outlier detection procedures, based on Mahalanobis distance, halfspace or Tukey depth, projection depth, and “Mahalanobis spatial” depth. For the goal of threshold type outlier detection, the Mahalanobis distance and projection procedures are found to be distinctly superior in performance, each with very high MBP, while the halfspace approach is quite inferior. When a moderate MBP suffices, the Mahalanobis spatial procedure is competitive, because its contours are not constrained to be elliptical and its computational burden is relatively mild. A small sampling experiment yields findings completely in accord with the theoretical comparisons. While these four depth procedures are relatively comparable for the purpose of robust affine equivariant location estimation, the halfspace depth is not competitive with the others for the quite different goal of robust setting of an outlyingness threshold. AMS 2000 Subject Classification: Primary 62G10; Secondary 62H99.
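To make the idea of threshold type outlier detection concrete, the following is a minimal sketch, not the authors' procedure. It assumes NumPy, uses the classical sample mean and covariance for the Mahalanobis distance outlyingness (the paper's robust variants are not reproduced here), and uses the plain spatial outlyingness, i.e. the norm of the average unit vector from the data to the point, without the affine invariant standardization of the “Mahalanobis spatial” version. The function names and the threshold value are hypothetical.

```python
import numpy as np

def mahalanobis_outlyingness(points, data):
    """Mahalanobis distance of each point from the sample mean of `data`.

    Assumption: classical (non-robust) mean and covariance estimates.
    """
    points = np.asarray(points, dtype=float)
    data = np.asarray(data, dtype=float)
    center = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    diff = points - center
    # Solve cov * z = diff^T rather than forming the inverse explicitly.
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))

def spatial_outlyingness(points, data):
    """Norm of the mean unit vector from the data to each point, in [0, 1].

    Assumption: plain spatial version, not affine invariant.
    """
    points = np.asarray(points, dtype=float)
    data = np.asarray(data, dtype=float)
    diff = points[:, None, :] - data[None, :, :]
    norms = np.linalg.norm(diff, axis=2, keepdims=True)
    # A zero difference (point coincides with a datum) contributes a zero vector.
    units = np.divide(diff, norms, out=np.zeros_like(diff), where=norms > 0)
    return np.linalg.norm(units.mean(axis=1), axis=1)

def flag_outliers(scores, threshold):
    """Threshold type identifier: flag points whose outlyingness exceeds `threshold`."""
    return np.asarray(scores) > threshold
```

A call such as `flag_outliers(mahalanobis_outlyingness(x, data), 3.0)` returns a boolean mask over the rows of `x`. The cutoff `3.0` is purely illustrative; how to set such a threshold robustly is exactly the question the abstract addresses.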

Related resources

Nonparametric Depth-Based Multivariate Outlier Identifiers, and Robustness Properties

In extending univariate outlier detection methods to higher dimension, various special issues arise, such as limitations of visualization methods, inadequacy of marginal methods, lack of a natural order, limited scope of parametric modeling, and restriction to ellipsoidal contours when using Mahalanobis distance methods. Here we pass beyond these limitations via an approach based on depth funct...

A numerical study of multiple imputation methods using nonparametric multivariate outlier identifiers and depth-based performance criteria with clinical laboratory data

It is well known that if a multivariate outlier has one or more missing component values, then multiple imputation methods tend to impute non-extreme values and make the outlier become less extreme and less likely to be detected. In this paper, nonparametric depth-based multivariate outlier identifiers are used as criteria in a numerical study comparing several established methods of multiple im...

General Foundations for Studying Masking and Swamping Robustness of Outlier Identifiers

With greatly advanced computational resources, the scope of statistical data analysis and modeling has widened to accommodate pressing new arenas of application. In all such data settings, an important and challenging task is the identification of outliers. Especially, an outlier identification procedure must be robust against the possibilities of masking (an outlier is undetected as such) and ...

On Masking and Swamping Robustness of Leading Outlier Identifiers for Univariate Data

In the wide-ranging scope of modern statistical data analysis, a key task is identification of outliers. In using an outlier identification procedure, one needs to know its robustness against masking (an “outlier” is undetected) and swamping (a “nonoutlier” is classified as an “outlier”), possibilities which can come about due to the presence of outliers. Study of these issues together is neces...

Survey on (Some) Nonparametric and Robust Multivariate Methods

Rather than attempt an encyclopedic survey of nonparametric and robust multivariate methods, we limit to a manageable scope by focusing on just two leading and pervasive themes, descriptive statistics and outlier identification. We set the stage with some perspectives, and we conclude with a look at some open issues and directions. A variety of questions are raised. Is nonparametric inference t...

Publication date: 2009